GEDEVO: An Evolutionary Graph Edit Distance Algorithm for Biological Network Alignment

Authors

  • Rashid Ibragimov
  • Maximilian Malek
  • Jiong Guo
  • Jan Baumbach
Abstract

Introduction: With so-called OMICS technologies, the scientific community has generated huge amounts of data that allow us to reconstruct the interplay of all kinds of biological entities. The emerging interaction networks are usually modeled as graphs with thousands of nodes and tens of thousands of edges between them. In addition to sequence alignment, the comparison of biological networks has shown great potential for inferring the biological function of proteins and genes. However, the corresponding network alignment problem is computationally hard and theoretically intractable for real-world instances.

Results: We therefore developed GEDEVO, a novel tool for efficient graph comparison dedicated to biological networks of real-world size. Our approach is based on the so-called Graph Edit Distance (GED) model, in which one graph is transformed into another with a minimal number of (or, more generally, minimal cost for) edge insertions and deletions. We present a novel evolutionary algorithm that aims to minimize the GED, and we compare our implementation against state-of-the-art tools: SPINAL, GHOST, C-GRAAL, and MI-GRAAL. On a set of protein-protein interaction networks from different organisms, we demonstrate that GEDEVO outperforms the current methods. It thus refines the previously suggested alignments based on topological information only.

Conclusion: With GEDEVO, we account for the rapidly growing number and size of available biological networks. The software and all data sets used are publicly available at http://gedevo.mpi-inf.mpg.de.

1998 ACM Subject Classification: F.2.2 Nonnumerical Algorithms and Problems
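
To make the Graph Edit Distance model from the abstract concrete, the following minimal Python sketch counts the edge insertions and deletions needed to turn one undirected graph into another under a fixed node mapping. It is an illustration only, not GEDEVO's implementation: the function name `edge_edit_distance`, the edge-list representation, the unit cost per operation, and the assumption of a one-to-one mapping between equally sized node sets are all hypothetical choices made for this example.

```python
def edge_edit_distance(edges_a, edges_b, mapping):
    """Count edge insertions plus deletions that turn graph A into graph B.

    Assumptions (illustrative, not taken from GEDEVO):
    - both graphs are undirected and given as lists of node pairs,
    - `mapping` is a one-to-one dict from A's nodes to B's nodes,
    - every insertion or deletion costs 1.
    """
    # Relabel A's edges with B's node names; frozenset ignores edge direction.
    mapped_a = {frozenset((mapping[u], mapping[v])) for u, v in edges_a}
    target_b = {frozenset(edge) for edge in edges_b}

    deletions = len(mapped_a - target_b)   # edges of A with no counterpart in B
    insertions = len(target_b - mapped_a)  # edges of B missing from mapped A
    return deletions + insertions


# A is a path a-b-c; B is a star centered on node 1.
edges_a = [("a", "b"), ("b", "c")]
edges_b = [(1, 2), (1, 3)]

poor_mapping = {"a": 1, "b": 2, "c": 3}
good_mapping = {"a": 2, "b": 1, "c": 3}

poor_cost = edge_edit_distance(edges_a, edges_b, poor_mapping)
good_cost = edge_edit_distance(edges_a, edges_b, good_mapping)
```

The two mappings give different costs for the same pair of graphs, which is why an alignment method has to search over mappings. An evolutionary algorithm such as the one the abstract describes would use a cost of this kind as the quantity to minimize.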

Similar resources

CytoGEDEVO: A Cytoscape app for fast and interactive network alignment

Recent years have seen a huge amount of biological network data becoming available, including Protein-Protein-Interaction (PPI) networks, metabolic networks, and gene regulatory networks. However, analysis of this data still poses a problem, as many methods for analyzing networks and gaining insight require solving NP-hard problems. This thesis focuses on the global network alignment problem fo...

On Suboptimal Alignments of Biological Sequences

It is widely accepted that the optimal alignment between a pair of proteins or nucleic acid sequences that minimizes the edit distance may not necessarily reflect the correct biological alignment. Alignments of proteins based on their structures or of DNA sequences based on evolutionary changes are often different from alignments that minimize edit distance. However, in many cases (e.g. when the ...

Tree Edit Distance, Alignment Distance and Inclusion

We survey the problem of comparing labeled trees based on simple local operations of deleting, inserting and relabeling nodes. These operations lead to the tree edit distance, alignment distance and inclusion problem. For each problem we review the results available and present, in detail, one or more of the central algorithms for solving the problem.

A Branching Alignment-Based Synthesis of Regular Expressions

We propose a novel Multiple Sequence Alignment algorithm that builds an optimized branching graph from a set of positive matching sample strings. The algorithm is based principally on the Minimum Edit Distance approach, applied incrementally. However, we essentially extended the set of edit operations. The newly added operations allow implementing an acyclic graph drawing feature. ...

A survey on tree edit distance and related problems

We survey the problem of comparing labeled trees based on simple local operations of deleting, inserting, and relabeling nodes. These operations lead to the tree edit distance, alignment distance, and inclusion problem. For each problem we review the results available and present, in detail, one or more of the central algorithms for solving the problem. Keywords: tree matching, edit distance

Publication date: 2013